Method and apparatus for locating a pointing element within a digital image

ABSTRACT

In order to work, the optical mouse device needs to analyze images (usually from a camera feed, although it is not limited to a camera) that have a single background color. Weather broadcasts often use a blue or green background as a matte, but the optical mouse device does not require any specific background color.

BACKGROUND OF THE INVENTION

[0001] What is referred to generically herein as an optical mouse device was first created in 1995 after more than three months of research on how to find a colored ball within a video field. One area of particular interest for use of an optical mouse is in television weather broadcasts. The talent, by using an optical mouse, could, for example, move a cloud shown on a display from one position to another by moving a handheld pointing device with a colored ball at its end.

[0002] There were two problems:

[0003] 1. Because the digitized video signal is too far removed from the raw video signal, it is not possible to find a specific color within a full video field.

[0004] 2. On a system with a slow processor, it is not possible to search the whole image for a single point of color and let another program work in parallel.

[0005] Therefore, a way to find the end (or tip) of a person's hand (or of a pointer device) was chosen to bypass the problem. No specific color is necessary to find the end. See U.S. Pat. No. 5,270,820 for a description of this technique.

SUMMARY OF THE INVENTION

[0006] In order to work, the optical mouse device needs to analyze images (usually from a camera feed, although it is not limited to a camera) that have a single background color. Weather broadcasts often use a blue or green background as a matte, but the optical mouse device does not require any specific background color.

[0007] The object to be tracked (possibly a person, but it is not limited to persons; e.g. it could track a robot) will be standing within the received image and will have a different color than the background as shown in FIG. 1 and FIG. 2. The object may have parts using that background color if some areas need to appear transparent through a matte, etc.

[0008] When the data is coming to the computer through a video signal, the computer will need a piece of hardware to convert the video signal to a digital representation supported by the optical mouse device. Whatever the format of the images, they will have to be converted to a one component (gray scale) or three component (red, green and blue) format. However, the way those components are represented does not make any difference to how the optical mouse works (FIG. 2).

[0009] Then the device will search an area of the image for a pointing device (optical mouse). The pointing device position will be returned to the user of the device. The explanations found here can be used to search for a person standing in front of a camera such as, but not limited to, a weather caster. Such a search could work in any direction for any kind of object or person and is valid for any kind of extension.

[0010] In this specific case, the device starts its search at the bottom of the area where the body of a standing person should always be found (FIG. 3a). There are a few cases explained below when the device may find arms without finding the body of the person first.

[0011] Once the body of the person is found, a search for the arms is conducted (FIG. 3b). To do so, a search is started from the top of the image and continues toward the bottom. If nothing is found, there is no extended arm.

[0012] Finally, a search is conducted along the arm up to the tip of the arm, i.e., end of the hand (FIG. 3c) or the pointing device (which will prolong the arm). This last search result is the position which the optical mouse device returns to the user.

[0013] When both arms are extended, one of them must be selected to know which is the one used to point (FIG. 3d). To accomplish this, simply select the one further away from the body: it is known where the body was found (Xb), and where the tip was found (Xt); therefore the distance |Xb − Xt| (absolute value of the subtraction of both positions) is also known; the longest distance determines which of the arms is the furthest away. Other methods could be used. Some optical mouse devices could be programmed to always return a position on the left side.

[0014] If no arm tip is found, a “no position” is returned.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] FIG. 1 shows a video field with a background and foreground object.

[0016] FIG. 2 shows a video field having a limited search area.

[0017] FIG. 2a shows a pixel being checked and its surrounding pixels.

[0018] FIG. 3a shows a search for a body portion.

[0019] FIG. 3b shows a search for arm portions.

[0020] FIG. 3c shows a search for a tip.

[0021] FIG. 3d shows a search for a desired arm portion.

[0022] FIG. 3e shows a detailed search for the tip.

[0023] FIG. 4 shows a special case for a body and arm search.

[0024] FIG. 5 shows a special case for an arm search.

DETAILED DESCRIPTION OF THE INVENTION

[0025] The following is a description of a special case referring to arms. These arms are described as the left arm and the right arm. The words ‘left’ and ‘right’ do not reference the real left arm or right arm of the person standing in the video field. Rather, they refer to the extension seen on the left and the extension seen on the right as presented in the figures. See FIG. 3d.

[0026] 1. Full Image

[0027] The computer needs a source to receive a full digital image. This source can be, but is not limited to: (1) the Internet, (2) a movie or (3) a direct video feed to the computer, in which case the video signal will need to be converted to a digital representation (FIG. 1).

[0028] The full image sizes are represented as the variables WIDTH and HEIGHT.

[0029] 2. Area to be Tracked

[0030] The area to be tracked can be any rectangle 11 within the full video field. The definition is (X-LEFT, Y-TOP, X-RIGHT, Y-BOTTOM) in pixel coordinates. These coordinates are used to determine where to search for the body 13 and arms 15 as shown in FIG. 1. The origin of these coordinates is the top-left corner, with the Y coordinates increasing going down and the X coordinates increasing going right.

[0031] 3. Search Limit

[0032] Depending on the type of source, it may be necessary to define a limit 19 to avoid searching near the borders (FIG. 2), thus limiting problems while searching. Most of the time, a video feed will have artifacts on the border. The search limit can also be generated with a smaller area to be tracked. However, a full image version of the optical mouse may rely only on a search limit.

[0033] The limit variable is defined as LIMIT and represents the number of pixels to skip on all borders. Note that four values could be defined to give a specific left, right, top and bottom limit. For the purpose of this discussion, we will limit ourselves to only one value.

[0034] 4. Background

[0035] The background 21 needs to be a single color; i.e. always use the same color. The color will possibly vary in intensity and chroma within a limited range defined by the user.

[0036] The color definitions will vary depending on the number of components of the digital representation of the image. Yet, each component can be checked against a range. A range will be defined as MIN for the lowest possible value and MAX for the highest possible value. These ranges may be defined as constants (the background is always of the same color in all the possible cases) or variables (the background can change from one usage to another; i.e. from a blue screen to a green screen).

[0037] Here is an algorithm to check if a pixel is of type background in a three component image (red, green and blue):

-- the PIXEL variable is defined as a structure RGB with
-- three fields: RED, GREEN and BLUE.
FUNCTION IS-BACKGROUND (VAR PIXEL : RGB)
  -- check the red pixel
  -- RED-MIN is the minimum accepted red value
  -- RED-MAX is the maximum accepted red value
  IF PIXEL.RED < RED-MIN OR PIXEL.RED > RED-MAX THEN
    RETURN FALSE
  END IF
  -- check the green pixel
  IF PIXEL.GREEN < GREEN-MIN OR PIXEL.GREEN > GREEN-MAX THEN
    RETURN FALSE
  END IF
  -- check the blue pixel
  IF PIXEL.BLUE < BLUE-MIN OR PIXEL.BLUE > BLUE-MAX THEN
    RETURN FALSE
  END IF
  -- if all the tests went through properly, the pixel is
  -- a background pixel
  RETURN TRUE

[0038] Example of Ranges:

Color of the background   Red range     Green range   Blue range
Blue                      0% to 15%     0% to 30%     75% to 100%
Green                     0% to 10%     75% to 100%   0% to 40%
White                     80% to 100%   80% to 100%   80% to 100%
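The range check and the example ranges can be sketched in Python as follows. This is only an illustrative sketch: the names Range, BACKGROUND_RANGES and full_scale are hypothetical, and the assumption of 8-bit components (0 to 255) is not taken from the description.

```python
from typing import NamedTuple


class Range(NamedTuple):
    lo: float  # MIN, as a fraction of full scale
    hi: float  # MAX, as a fraction of full scale


# Example ranges from the table, in (red, green, blue) order.
BACKGROUND_RANGES = {
    "blue": (Range(0.00, 0.15), Range(0.00, 0.30), Range(0.75, 1.00)),
    "green": (Range(0.00, 0.10), Range(0.75, 1.00), Range(0.00, 0.40)),
    "white": (Range(0.80, 1.00), Range(0.80, 1.00), Range(0.80, 1.00)),
}


def is_background(pixel, ranges, full_scale=255):
    """Return True when every component lies inside its MIN..MAX range."""
    return all(
        r.lo * full_scale <= component <= r.hi * full_scale
        for component, r in zip(pixel, ranges)
    )


def is_object(pixel, ranges, full_scale=255):
    """A pixel is part of the object when it is not a background pixel."""
    return not is_background(pixel, ranges, full_scale)
```

For instance, is_background((10, 20, 230), BACKGROUND_RANGES["blue"]) accepts a saturated blue pixel, while a skin-toned pixel such as (200, 180, 160) is rejected and therefore counts as object.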

[0039] With some optical devices, it will be better to check more than a single pixel. It is possible to check nine (9) pixels instead of one to make sure it is a background pixel as shown in FIG. 2a. The range used to check the surrounding pixels may differ from the range used to check the middle pixel.
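A nine-pixel check of this kind might look like the following Python sketch. It assumes the image is a list of rows of gray-scale values and that (x, y) is not on the image border; the two predicate parameters, center_test and neighbor_test, are hypothetical names standing for the two ranges.

```python
def is_background_3x3(image, x, y, center_test, neighbor_test):
    """Check the pixel at (x, y) and its eight neighbors.

    center_test is applied to the middle pixel and neighbor_test to the
    surrounding pixels, so that the two ranges may differ.
    """
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            test = center_test if (dx, dy) == (0, 0) else neighbor_test
            if not test(image[y + dy][x + dx]):
                return False
    return True


# Gray-scale example: a dark background, once clean and once with a bright neighbor.
strict = lambda p: p <= 30   # range for the middle pixel
loose = lambda p: p <= 60    # wider range for the surrounding pixels
clean = [[10, 12, 11], [9, 10, 50], [10, 11, 12]]
noisy = [[10, 12, 11], [9, 10, 200], [10, 11, 12]]
```

With these values the middle pixel of clean is accepted as background, while the bright neighbor in noisy causes the check to fail.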

[0040] 5. Object

[0041] On the other hand, the object will be of all the colors which are not the background color. The area to be searched can therefore be defined with a positive (the object) and a negative (the background) set of pixels. A program could be written to transform the whole image in this way. However, to make it work in real time, it is necessary to check a very limited number of pixels. The algorithm below shows how this works.

FUNCTION IS-OBJECT (VAR PIXEL : RGB)
  -- we return TRUE when IS-BACKGROUND returns FALSE
  -- and vice versa
  IF IS-BACKGROUND (PIXEL) THEN
    RETURN FALSE
  ELSE
    RETURN TRUE
  END IF

[0042] 6. Search

[0043] To search for something on the screen, the optical mouse device needs to determine the sign of a pixel (positive or negative). This is done with a call to the IS-BACKGROUND or IS-OBJECT functions as defined in the Background and Object sections above. A search algorithm can look like this:

FUNCTION FIND (VAR X, Y: INTEGER ;
               VAR X-INCREMENT, Y-INCREMENT: INTEGER ;
               VAR COUNT: INTEGER)
  -- COUNT determines the maximum number of iterations
  -- we can use to find the object
  WHILE COUNT > 0
    -- get the pixel at the position X, Y
    PIXEL := READ-PIXEL (X, Y)
    -- check if the pixel is part of the object
    IF IS-OBJECT (PIXEL) THEN
      -- we found the object, stop the search and
      -- return the position at which the object
      -- was found
      RETURN X, Y
    END IF
    -- the object was not found yet
    -- decrease the number of pixels to still be checked
    COUNT := COUNT − 1
    -- define the next X coordinate to be checked
    -- (note: X-INCREMENT can be negative to go from right to left)
    X := X + X-INCREMENT
    -- define the next Y coordinate to be checked
    -- (note: Y-INCREMENT can be negative to go from bottom to top)
    Y := Y + Y-INCREMENT
  END WHILE
  -- return a position which can't be reached as a “no position”
  -- in this example we use (−1, −1)
  RETURN −1, −1

[0044] In this FIND function we search for an object. The opposite could be accomplished simply by changing the IS-OBJECT function call to IS-BACKGROUND. The variable COUNT is used to know how many pixels will be checked. The increment variables (X-INCREMENT and Y-INCREMENT) should be used with values larger than one (1) whenever possible so as to avoid testing many pixels.
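A minimal Python sketch of such a stepping search is given below. The read_pixel and is_object callables are passed in as parameters; that interface, and the NO_POSITION name, are assumptions of this sketch rather than part of the described device.

```python
NO_POSITION = (-1, -1)  # the "no position" value used in the description


def find(read_pixel, is_object, x, y, x_increment, y_increment, count):
    """Test at most `count` pixels, starting at (x, y) and stepping by the increments."""
    while count > 0:
        if is_object(read_pixel(x, y)):
            return (x, y)
        count -= 1
        x += x_increment
        y += y_increment
    return NO_POSITION


# One-row example: an object occupies columns 60..69 of a 100-pixel row.
row = [0] * 100
for i in range(60, 70):
    row[i] = 1
hit = find(lambda x, y: row[x], bool, 0, 0, 20, 0, 5)
miss = find(lambda x, y: row[x], bool, 0, 0, 20, 0, 3)
```

With an increment of 20, the first call tests columns 0, 20, 40 and 60 and stops at column 60; the second call exhausts its three iterations before reaching the object and returns the "no position" value.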

[0045] 7. Body

[0046] The first part we will search for is the body of the object. The following is an algorithm to search for a standing person. However, the optical mouse device need not be based specifically on such an object; it could be applied to a flying plane, an object attached to a ceiling, etc.

[0047] The search can be accomplished with the FIND function as defined in the Search section above. It will check one pixel every 20 pixels until the body is found or the whole width of the searched area has been covered. The position at which the body is found is put in variables for later use.

PROCEDURE FIND-BODY
  -- define the STEP used to check pixels to find the body ;
  -- the STEP needs to change depending on the area and therefore
  -- the size of the standing person ;
  -- 20 is a good value for a full screen area
  STEP = 20
  -- compute the maximum number of left-right iterations within
  -- the area to be searched
  MAX-TEST = (X-RIGHT − X-LEFT − LIMIT * 2) / STEP
  -- find the left position of the body
  FIND (
    X-LEFT + LIMIT,
    Y-BOTTOM − LIMIT,
    STEP,
    0,
    MAX-TEST
  ) RETURNING BODY-X-LEFT, BODY-Y-LEFT
  -- here, we should check whether the body was found
  -- when not, it's a special case and no body will be
  -- found searching from right to left so we could skip
  -- the following call
  -- to find the right position of the body we could also
  -- search for the background going through the body instead
  -- of searching from the other side of the screen (another
  -- FIND function would be necessary)
  -- find the right position of the body
  FIND (
    X-RIGHT − LIMIT,
    Y-BOTTOM − LIMIT,
    −STEP,
    0,
    MAX-TEST
  ) RETURNING BODY-X-RIGHT, BODY-Y-RIGHT

[0048] In this algorithm the following is defined:

[0049] The variable STEP: sets the number of pixels to be skipped on each iteration; the use of a STEP value is important to make the search fast; in this example we defined the step as 20, so if we start searching at the position (10, 120), the second pixel tested will be found at the position (30, 120), the third at (50, 120), etc.

[0050] The variable MAX-TEST: it sets the number of pixels which can be tested left-right; we use the X-LEFT and X-RIGHT coordinates as defined above; the LIMIT * 2 would be:

[0051] . . . (LEFT-LIMIT+RIGHT-LIMIT)

[0052] if the limits were specific on each side.

[0053] The call to the FIND function to look for the left side: starts the search at the position:

[0054] (X-LEFT+LIMIT, Y-BOTTOM−LIMIT)

[0055] which is at the bottom-left of the image. The function will continue the search increasing the X coordinate by STEP (Y is not changed). The second pixel tested (if the body was not found yet) is:

[0056] (X-LEFT+LIMIT+STEP, Y-BOTTOM−LIMIT)

[0057] The FIND function tests pixels until the body is found, or MAX-TEST pixels are checked. The result is then saved in the BODY-X-LEFT and BODY-Y-LEFT variables.

[0058] The call to the FIND function to look for the right side: starts the search at the position:

[0059] (X-RIGHT−LIMIT, Y-BOTTOM−LIMIT)

[0060] which is at the bottom-right of the image; the function will continue the search decreasing the X coordinate by STEP (Y is not changed). The second pixel tested (if the body was not found yet) is:

[0061] (X-RIGHT−LIMIT−STEP, Y-BOTTOM−LIMIT)

[0062] The FIND function tests pixels until the body is found, or MAX-TEST pixels are checked. The result is then saved in the BODY-X-RIGHT and BODY-Y-RIGHT variables.
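The two bottom-row scans can be sketched in Python as follows. The is_object_at(x, y) predicate, which hides the pixel read and the background test, is a hypothetical interface chosen to keep the sketch self-contained; skipping the right-hand scan when the left-hand scan fails follows the remark made in the listing's comments.

```python
NO_POSITION = (-1, -1)


def find(is_object_at, x, y, dx, dy, count):
    """Step from (x, y) by (dx, dy), testing at most `count` pixels."""
    while count > 0:
        if is_object_at(x, y):
            return (x, y)
        count -= 1
        x, y = x + dx, y + dy
    return NO_POSITION


def find_body(is_object_at, x_left, x_right, y_bottom, limit, step=20):
    """Scan the bottom row of the area from both sides; return (left, right)."""
    max_test = (x_right - x_left - limit * 2) // step
    row = y_bottom - limit
    left = find(is_object_at, x_left + limit, row, step, 0, max_test)
    if left == NO_POSITION:
        # Nothing on this row: the scan from the right would not find anything either.
        return NO_POSITION, NO_POSITION
    right = find(is_object_at, x_right - limit, row, -step, 0, max_test)
    return left, right


# Example scene: a body occupying columns 300..400 of a 640 x 480 area.
body = lambda x, y: 300 <= x <= 400
left_edge, right_edge = find_body(body, 0, 640, 480, 10)
```

Because only one pixel in every 20 is tested, the reported edges (310 and 390 in this scene) can lie up to one STEP inside the true silhouette.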

[0063] 8. Arms

[0064] Once the body has been found, a search for the arms 15 is conducted. See FIG. 3b. In the example shown, we start looking from the top of the searched area to avoid any problems with the shadow the person may cast on the background. A shadow that is too dark would be seen as being positive (i.e. as part of the standing person).

[0065] The search for the arms is very similar to the search for the body. Here is an example of an algorithm to do so:

PROCEDURE FIND-ARMS
  -- define the STEP used to check pixels to find the arm ;
  -- it is smaller than for the body because arms are smaller
  -- the STEP needs to change depending on the area and therefore
  -- the size of the standing person
  STEP = 10
  -- compute the maximum number of top-bottom iterations
  -- within the area to be searched
  MAX-TEST = (Y-BOTTOM − Y-TOP − LIMIT * 2) / STEP
  -- find the position of the left arm
  FIND (
    BODY-X-LEFT − 20,
    Y-TOP + LIMIT,
    0,
    STEP,
    MAX-TEST
  ) RETURNING ARM-X-LEFT, ARM-Y-LEFT
  -- find the position of the right arm
  FIND (
    BODY-X-RIGHT + 20,
    Y-TOP + LIMIT,
    0,
    STEP,
    MAX-TEST
  ) RETURNING ARM-X-RIGHT, ARM-Y-RIGHT

[0066] This algorithm is very similar to the one in the Body algorithm description above. Note that the step is now used to increment the Y coordinate (X is not changed). There should be a test, which does not appear here, to make sure both sides of the body were found. The search starts from BODY-X-LEFT−20 and BODY-X-RIGHT+20 to avoid searching for shoulders and arms along the body (i.e. obviously not pointing). The value of MAX-TEST could also be made smaller since the arms cannot be lower than a certain point; the computation could be:

[0067] SEARCH-HEIGHT = Y-BOTTOM − Y-TOP − LIMIT * 2
MAX-TEST = SEARCH-HEIGHT * 0.80 / STEP

[0068] where we use only 80% of SEARCH-HEIGHT so the lowest 20% of the searched area will not be checked for an arm. See FIG. 3b.
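As an illustration, the top-down arm scan with the reduced MAX-TEST could be written in Python as below. The is_object_at(x, y) predicate and the parameter names offset and percent are assumptions of this sketch; integer arithmetic is used for the 80% computation to avoid rounding surprises.

```python
NO_POSITION = (-1, -1)


def find_arms(is_object_at, body_left_x, body_right_x, y_top, y_bottom,
              limit, step=10, offset=20, percent=80):
    """Scan downward beside each body edge; return (left_arm, right_arm)."""
    search_height = y_bottom - y_top - limit * 2
    # Only the upper `percent` of the searched height is checked for an arm.
    max_test = search_height * percent // (100 * step)

    def scan_down(x):
        y = y_top + limit
        for _ in range(max_test):
            if is_object_at(x, y):
                return (x, y)
            y += step
        return NO_POSITION

    return scan_down(body_left_x - offset), scan_down(body_right_x + offset)


# Example scene: a body at columns 300..400 with one arm extended to the left.
scene = lambda x, y: (300 <= x <= 400 and y >= 100) or (200 <= x < 300 and 200 <= y <= 220)
left_arm, right_arm = find_arms(scene, 310, 390, 0, 480, 10)
```

With Y-TOP = 0, Y-BOTTOM = 480, LIMIT = 10 and STEP = 10, SEARCH-HEIGHT is 460 and MAX-TEST becomes 36, so the scan stops at row 360 instead of continuing to the bottom of the area.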

[0069] 9. Pointing Device

[0070] Once arms are found, we can start searching for the tip 25 of the arm as shown in FIG. 3c. This search is more complicated, and can be accomplished by checking inside the arm or around the arm. The algorithm presented here goes around the arm.

FUNCTION FIND-TIP (VAR X, Y: INTEGER ;
                   VAR X-DIRECTION: INTEGER ;
                   VAR STEP: INTEGER)
  -- loop until a RETURN is reached
  LOOP FOREVER
    SAVED-X := X
    SAVED-Y := Y
    -- move toward the tip of the arm
    X := X + STEP * X-DIRECTION
    -- clip the X position on the right
    IF X >= X-RIGHT − LIMIT THEN
      -- no tip found (or found on the border...)
      RETURN −1, −1
    END IF
    -- clip the X position on the left
    IF X <= X-LEFT + LIMIT THEN
      -- no tip found (or found on the border...)
      RETURN −1, −1
    END IF
    -- check the pixel, is it a background pixel?
    IF IS-BACKGROUND (X, Y) THEN
      -- the loop ends with the UNTIL
      LOOP
        -- move toward the bottom of the area
        Y := Y + STEP
        -- when we reach the bottom of the screen, we are “after”
        -- the tip of the pointing device ; so check for that case
        IF Y >= Y-BOTTOM − LIMIT THEN
          RETURN SAVED-X, SAVED-Y
        END IF
        -- check going below
      UNTIL IS-OBJECT (X, Y)
      -- keep the position outside the arm
      Y := Y − STEP
    ELSE
      -- the pixel on the left isn't a background pixel
      -- therefore we need to go up instead of down to go
      -- around it ;
      -- the loop ends with the UNTIL
      LOOP
        -- move toward the top of the area
        Y := Y − STEP
        -- check whether we reached the top of the screen
        IF Y <= Y-TOP + LIMIT THEN
          -- this case could be treated as a special case by
          -- strong optical mouse devices as an arm touching
          -- the top of the screen ; yet in many cases the arm
          -- can never reach such a place
          RETURN −1, −1
        END IF
      UNTIL IS-BACKGROUND (X, Y)
    END IF
    -- here we know we have a background pixel at (X, Y)
  END LOOP

[0071] This first algorithm can be used to find the tip of one arm. The direction (X-DIRECTION) parameter can be used to search on the left (−1) or on the right (1) as described below.

PROCEDURE FIND-POINTERS
  -- search for the left pointing device with 5 pixels precision
  -- (direction set to −1 and step to 5)
  FIND-TIP (
    ARM-X-LEFT,
    ARM-Y-LEFT − 10,
    −1,
    5
  ) RETURNING TIP-X-LEFT, TIP-Y-LEFT
  -- search for the left pointing device with 1 pixel precision
  -- (direction set to −1 and step to 1)
  FIND-TIP (
    TIP-X-LEFT − 5,
    TIP-Y-LEFT − 5,
    −1,
    1
  ) RETURNING TIP-X-LEFT, TIP-Y-LEFT
  -- search for the right pointing device with 5 pixels precision
  -- (direction set to 1 and step to 5)
  FIND-TIP (
    ARM-X-RIGHT,
    ARM-Y-RIGHT − 10,
    1,
    5
  ) RETURNING TIP-X-RIGHT, TIP-Y-RIGHT
  -- search for the right pointing device with 1 pixel precision
  -- (direction set to 1 and step to 1)
  FIND-TIP (
    TIP-X-RIGHT − 5,
    TIP-Y-RIGHT − 5,
    1,
    1
  ) RETURNING TIP-X-RIGHT, TIP-Y-RIGHT

[0072] This last algorithm calls the FIND-TIP function, which determines the tip of each arm. FIND-ARMS should check whether an arm was found before searching for its tip. The search is performed twice (see FIG. 3e): (1) with a step of 5, which returns a point near the tip, about 5 pixels away, and (2) with a step of 1, which returns a point next to the tip (i.e. at most 1 pixel away). This technique is used only to enable the processing to occur in substantially real time. Other techniques can be used to close in on the tip instead of a new call to FIND-TIP.
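The coarse-then-fine idea can be illustrated with the Python sketch below, which walks along the upper edge of an arm. Several details are assumptions of this sketch and not statements about the described device: the is_object_at(x, y) predicate, the area tuple, the refine_tip helper, and the choice to restart the fine pass a few pixels back toward the body and slightly higher.

```python
NO_POSITION = (-1, -1)


def find_tip(is_object_at, x, y, direction, step, area, limit):
    """Follow the upper edge of an arm toward its tip; direction is -1 or 1."""
    x_left, y_top, x_right, y_bottom = area
    while True:
        saved = (x, y)
        x += step * direction
        if x >= x_right - limit or x <= x_left + limit:
            return NO_POSITION           # ran into the border: no tip found
        if not is_object_at(x, y):
            while True:                  # background: go down until touching the arm
                y += step
                if y >= y_bottom - limit:
                    return saved         # nothing below: we are past the tip
                if is_object_at(x, y):
                    break
            y -= step                    # stay just outside the arm
        else:
            while True:                  # inside the arm: go up to get around it
                y -= step
                if y <= y_top + limit:
                    return NO_POSITION
                if not is_object_at(x, y):
                    break


def refine_tip(is_object_at, arm_x, arm_y, direction, area, limit, coarse=5, rise=10):
    """Coarse pass with step `coarse`, then a fine pass with step 1."""
    tip = find_tip(is_object_at, arm_x, arm_y - rise, direction, coarse, area, limit)
    if tip == NO_POSITION:
        return NO_POSITION
    # Assumption: restart a little closer to the body and a little higher.
    return find_tip(is_object_at, tip[0] - coarse * direction, tip[1] - coarse,
                    direction, 1, area, limit)


# Example scene: a body at columns 300..400 and a left arm whose tip is at column 102.
scene = lambda x, y: (300 <= x <= 400 and y >= 100) or (102 <= x < 300 and 200 <= y <= 220)
tip = refine_tip(scene, 290, 200, -1, (0, 0, 640, 480), 10)
```

In this scene the coarse pass stops at (105, 195), about one step short of the tip, and the fine pass then returns (102, 199), one pixel above the end of the arm.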

[0073] 10. Left or Right?

[0074] When two pointing devices are found (one on the right 15 and one on the left 27), a simple distance algorithm is used to determine which one will be used. See FIG. 3d. Other algorithms could be used, such as always forcing the left side or the right side to be picked.

FUNCTION SELECT-TIP
  IF BODY-X-LEFT − TIP-X-LEFT > TIP-X-RIGHT − BODY-X-RIGHT THEN
    RETURN TIP-X-LEFT, TIP-Y-LEFT
  ELSE
    RETURN TIP-X-RIGHT, TIP-Y-RIGHT
  END IF

[0075] Since it is known which variable has the larger X coordinate in each subtraction, there is no need to use an absolute value function.
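A Python sketch of this selection follows. The handling of a missing tip (one or both sides equal to the "no position" value) is an addition made for this sketch and is not part of the SELECT-TIP listing.

```python
NO_POSITION = (-1, -1)


def select_tip(body_left_x, body_right_x, tip_left, tip_right):
    """Pick the tip that reaches furthest away from the body."""
    if tip_left == NO_POSITION:
        return tip_right                 # may itself be NO_POSITION
    if tip_right == NO_POSITION:
        return tip_left
    # The left tip lies left of the body and the right tip lies right of it,
    # so both differences are non-negative and no absolute value is needed.
    if body_left_x - tip_left[0] > tip_right[0] - body_right_x:
        return tip_left
    return tip_right


chosen = select_tip(310, 390, (102, 199), (450, 210))
```

Here the left tip reaches 208 pixels from the body edge against 60 pixels for the right tip, so the left tip is returned.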

[0076] 11. Body on the Border

[0077] There are two special cases when the body 13 is on the border of the screen: one when the body is on the right (see FIG. 4) and one when the body is on the left. In both cases, a search is conducted for one arm 15. The search will be exactly the same as described above in the Body, Arms and Pointing Device algorithms.

[0078] The Body algorithm will usually return an unusable position for the search of the body when the person is standing on a border. A resulting position which is equal to the startup or the first position can be declared as being an invalid position (i.e. the body is on that side).

[0079] The handling of this special case just means that one side of the body may not exist. Therefore, the Arms algorithm needs to search only one of the two arms and set the other arm position to a ‘no position’ status. Similarly, the Pointing Device algorithm needs to search the tip of only one arm.

[0080] An algorithm could look like this:

-- do some setup
...
-- search for the body
FIND-BODY
-- check if the left position is correct
-- (instead of 25, we should use the body STEP value)
IF BODY-X-LEFT < X-LEFT + LIMIT + 25 THEN
  -- not a correct position
  -- we didn't find the body here
  BODY-X-LEFT := −1
  BODY-Y-LEFT := −1
END IF
-- check if the right position is correct
IF BODY-X-RIGHT > X-RIGHT − LIMIT − 25 THEN
  -- not a correct position
  -- we didn't find the body here
  BODY-X-RIGHT := −1
  BODY-Y-RIGHT := −1
END IF
-- check if one side wasn't found
-- search for only one arm
IF BODY-X-LEFT = −1 AND BODY-Y-LEFT = −1 THEN
  IF BODY-X-RIGHT = −1 AND BODY-Y-RIGHT = −1 THEN
    -- both sides weren't found
    -- check for flying arms ;
    -- (see next point)
    FLYING-ARMS
  ELSE
    -- the left side wasn't found
    -- search the right arm only
    FIND-RIGHT-ARM
  END IF
  RETURN
END IF
IF BODY-X-RIGHT = −1 AND BODY-Y-RIGHT = −1 THEN
  -- the right side wasn't found
  -- search the left arm only
  FIND-LEFT-ARM
  RETURN
END IF
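The decision logic can be summarized in the following Python sketch. It assumes that a body edge is unusable when it lies within one body STEP of the corresponding search limit; the function name arms_to_search and its string results are hypothetical.

```python
NO_X = -1  # X value of the "no position" result


def arms_to_search(body_left_x, body_right_x, x_left, x_right, limit, step=20):
    """Return "both", "left", "right" or "flying" depending on the usable body edges."""
    left_ok = body_left_x != NO_X and body_left_x >= x_left + limit + step
    right_ok = body_right_x != NO_X and body_right_x <= x_right - limit - step
    if left_ok and right_ok:
        return "both"
    if left_ok:
        return "left"     # body on the right border: only a left arm can be seen
    if right_ok:
        return "right"    # body on the left border: only a right arm can be seen
    return "flying"       # no usable body edge: handle as flying arms
```

For a 640-pixel-wide area with a limit of 10, a body found between columns 310 and 390 yields "both", a body touching the left border (edges at 10 and 90) yields "right", and a result of −1 on both sides yields "flying".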

[0081] 12. No Body Found or the “Flying Arms”

[0082] This special case happens whenever the person is out of the video field but is still pointing to something within the video field as shown in FIG. 5. This special case occurs whenever the body 13 is not found at all in the area to be searched.

[0083] Once this special case is determined (as in the Body or the Border algorithm), a search for an arm (also called a flying arm) on the left is performed as if the body were just further left of the limit. Similarly, if nothing is found on the left (a limitation given here because a person cannot be on both sides of a screen, which is specific to this example of the optical mouse device), a check on the right is performed as if the body were just further right of the limit.

[0084] The following algorithm shows an example to do so:

[0085] PROCEDURE FLYING-ARMS

[0086] -- simulate a body on the right of the screen
BODY-X-LEFT = X-RIGHT − LIMIT
BODY-Y-LEFT = Y-BOTTOM − LIMIT

[0087] -- simulate a body on the left of the screen
BODY-X-RIGHT = X-LEFT + LIMIT
BODY-Y-RIGHT = Y-BOTTOM − LIMIT

[0088] -- search for the arms
FIND-ARMS

[0089] -- search for the pointing device
FIND-POINTERS

[0090] The FLYING-ARMS procedure sets all the body coordinates to fake positions as if a body had been found. Thus the FIND-ARMS procedure can be called to perform a search for two arms.
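To make the effect of the fake positions concrete, the short Python sketch below computes the two columns that the arm search would then scan, assuming the 20-pixel offset used in the FIND-ARMS listing. The function name flying_arm_columns is hypothetical.

```python
def flying_arm_columns(x_left, x_right, limit, offset=20):
    """Columns scanned for arms when the body is simulated outside the area."""
    simulated_body_left_x = x_right - limit   # body assumed beyond the right border
    simulated_body_right_x = x_left + limit   # body assumed beyond the left border
    # The arm search looks `offset` pixels away from each simulated body edge.
    return simulated_body_left_x - offset, simulated_body_right_x + offset


columns = flying_arm_columns(0, 640, 10)
```

For a 640-pixel-wide area with a limit of 10, the scans run down columns 610 and 30, that is, just inside the right and left search limits.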

[0091] 13. Nothing Found

[0092] In the event nothing is found (the area is only negative) the optical mouse device will return a “no position” value.

[0093] Because of the way it works, the optical mouse could be made to search several areas in each image. This way, more than one pointing device could be found and managed; e.g. two persons could appear in the image: one on the left and one on the right.

1. A method for finding a pointing element within a digital image comprising the steps of:
a) receiving a digitized image including a background and an object to be tracked, said background and said object having a different color;
b) defining a rectangle within said image including at least a portion of said object;
c) conducting a search within the defined rectangle starting at a position adjacent a first side of a bottom portion of said rectangle and continuing in a direction towards a second side of said rectangle until said object is located and storing a first x value corresponding thereto;
d) conducting a second search within the defined rectangle starting at a position adjacent said second side of said bottom portion of said rectangle and continuing in a direction towards said first side of said rectangle until said object is located and storing a second x value corresponding thereto;
e) conducting a third search within the defined rectangle starting at a position adjacent a top portion of said rectangle and offset from said first x value and continuing in a direction towards said bottom until said object is located and storing a first y value corresponding thereto;
f) conducting a fourth search within the defined rectangle starting at a position adjacent said top portion of said rectangle and offset from said second x value and continuing in a direction towards said bottom until said object is located and storing a second y value corresponding thereto;
g) conducting a fifth search using the stored first x and second x and first y and second y positions to find a point representing one of a leftmost and a rightmost position of said object and storing x and y coordinates corresponding to said found point.